Dataset statistics
| Number of variables | 18 |
|---|---|
| Number of observations | 9785 |
| Missing cells | 1721 |
| Missing cells (%) | 1.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 15.6 MiB |
| Average record size in memory | 1.6 KiB |
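The derived figures in the table above follow from the raw counts. As a quick sanity check, the minimal Python sketch below recomputes the missing-cell share and the average record size; the variable names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 18
n_observations = 9785
missing_cells = 1721
total_size_mib = 15.6

# Every observation contributes one cell per variable.
total_cells = n_variables * n_observations

# Share of all cells that are missing, in percent.
missing_pct = 100 * missing_cells / total_cells

# Average record size in KiB (1 MiB = 1024 KiB).
avg_record_kib = total_size_mib * 1024 / n_observations

print(f"{total_cells} cells, {missing_pct:.1f}% missing, "
      f"{avg_record_kib:.1f} KiB per record")
# prints: 176130 cells, 1.0% missing, 1.6 KiB per record
```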
Variable types
| Text | 14 |
|---|---|
| Categorical | 2 |
| Numeric | 1 |
| DateTime | 1 |
Alerts
| Alert | Type |
|---|---|
| Bedrooms_df1 is highly imbalanced (51.8%) | Imbalance |
| Bathrooms_df1 is highly imbalanced (51.6%) | Imbalance |
| Price_df2 has 144 (1.5%) missing values | Missing |
| Area_df2 has 151 (1.5%) missing values | Missing |
| Bedrooms_df2 has 149 (1.5%) missing values | Missing |
| Bathrooms_df2 has 151 (1.5%) missing values | Missing |
| Floors has 155 (1.6%) missing values | Missing |
| Amenities has 174 (1.8%) missing values | Missing |
| Street name has 152 (1.6%) missing values | Missing |
| Ward name has 170 (1.7%) missing values | Missing |
| District name has 149 (1.5%) missing values | Missing |
| Frontages has 175 (1.8%) missing values | Missing |
| Main road has 151 (1.5%) missing values | Missing |
| Listing ID has unique values | Unique |
Reproduction
| Analysis started | 2024-06-21 08:42:22.579853 |
|---|---|
| Analysis finished | 2024-06-21 08:42:28.123804 |
| Duration | 5.54 seconds |
| Software version | ydata-profiling v4.8.3 |
| Download configuration | config.json |
Price_df1
Text
| Distinct | 1074 |
|---|---|
| Distinct (%) | 11.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1004.3 KiB |
Length
| Max length | 16 |
|---|---|
| Median length | 14 |
| Mean length | 6.6399591 |
| Min length | 4 |
Characters and Unicode
| Total characters | 64972 |
|---|---|
| Distinct characters | 26 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 493 |
|---|---|
| Unique (%) | 5.0% |
Sample
| 1st row | 3899000000 tỷ |
|---|---|
| 2nd row | Thỏa thuận |
| 3rd row | 12.8 tỷ |
| 4th row | 3.4 tỷ |
| 5th row | Thỏa thuận |
| Value | Count | Frequency (%) |
|---|---|---|
| tỷ | 8760 | 44.8% |
| thỏa | 849 | 4.3% |
| thuận | 849 | 4.3% |
| 5.5 | 176 | 0.9% |
| triệu | 172 | 0.9% |
| 4.5 | 164 | 0.8% |
| 6.5 | 153 | 0.8% |
| 5 | 149 | 0.8% |
| 6 | 139 | 0.7% |
| 5.8 | 127 | 0.6% |
| Other values (1026) | 8032 | 41.0% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| (space) | 9785 | 15.1% |
| t | 9781 | 15.1% |
| ỷ | 8760 | 13.5% |
| . | 6556 | 10.1% |
| 5 | 4115 | 6.3% |
| 0 | 3240 | 5.0% |
| 3 | 2213 | 3.4% |
| 4 | 2173 | 3.3% |
| 1 | 2171 | 3.3% |
| 6 | 1922 | 3.0% |
| Other values (16) | 14256 | 21.9% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 64972 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 64972 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 64972 | 100.0% |
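Price_df1 is profiled as text because each value mixes a number with a unit (`tỷ`, billion; `triệu`, million) or is the placeholder `Thỏa thuận` (negotiable). A minimal sketch of one way to turn such strings into a VND amount follows. The function name `parse_price_vnd` is hypothetical, and the rule that treats a number of one million or more as already being in VND (as in the sample `3899000000 tỷ`) is an assumption, not something the report states.

```python
import re
import unicodedata
from decimal import Decimal

# Multipliers for the two units seen in the word-frequency table.
UNIT_MULTIPLIER = {"tỷ": 10**9, "triệu": 10**6}

# A number (dot or comma decimal) followed by a known unit.
_PRICE_RE = re.compile(r"^\s*(\d+(?:[.,]\d+)?)\s*(tỷ|triệu)\b", re.IGNORECASE)


def parse_price_vnd(text):
    """Return the price in VND as an int, or None if no price is stated."""
    if text is None:
        return None
    # Normalise so composed and decomposed Vietnamese letters compare equal.
    text = unicodedata.normalize("NFC", text)
    match = _PRICE_RE.match(text)
    if match is None:
        return None  # e.g. "Thỏa thuận" (negotiable)
    number = Decimal(match.group(1).replace(",", "."))
    unit = match.group(2).lower()
    # Assumption: a value this large was entered in raw VND and the unit
    # suffix is spurious.
    if number >= 10**6:
        return int(number)
    return int(number * UNIT_MULTIPLIER[unit])


print(parse_price_vnd("12.8 tỷ"))        # 12800000000
print(parse_price_vnd("Thỏa thuận"))     # None
print(parse_price_vnd("3899000000 tỷ"))  # 3899000000
```

`Decimal` is used instead of `float` so that values such as `12.8` scale to an exact integer.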
Area_df1
Text
| Distinct | 501 |
|---|---|
| Distinct (%) | 5.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 587.9 KiB |
Length
| Max length | 9 |
|---|---|
| Median length | 4 |
| Mean length | 4.5121104 |
| Min length | 3 |
Characters and Unicode
| Total characters | 44151 |
|---|---|
| Distinct characters | 19 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 219 |
|---|---|
| Unique (%) | 2.2% |
Sample
| 1st row | 150 m |
|---|---|
| 2nd row | No Area |
| 3rd row | 75 m |
| 4th row | 110 m |
| 5th row | 12 m |
| Value | Count | Frequency (%) |
|---|---|---|
| m | 8780 | 44.9% |
| no | 1005 | 5.1% |
| area | 1005 | 5.1% |
| 60 | 468 | 2.4% |
| 40 | 329 | 1.7% |
| 80 | 326 | 1.7% |
| 48 | 289 | 1.5% |
| 50 | 281 | 1.4% |
| 52 | 245 | 1.3% |
| 45 | 218 | 1.1% |
| Other values (493) | 6624 | 33.8% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| (space) | 9785 | 22.2% |
| m | 8780 | 19.9% |
| 0 | 3003 | 6.8% |
| 5 | 2726 | 6.2% |
| 4 | 2477 | 5.6% |
| 6 | 2175 | 4.9% |
| 1 | 1844 | 4.2% |
| 2 | 1789 | 4.1% |
| 8 | 1733 | 3.9% |
| 3 | 1639 | 3.7% |
| Other values (9) | 8200 | 18.6% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 44151 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 44151 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 44151 | 100.0% |
Bedrooms_df1
Categorical
IMBALANCE 
| Distinct | 37 |
|---|---|
| Distinct (%) | 0.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 609.4 KiB |
| No Bedrooms | |
|---|---|
| 2 PN | |
| 3 PN | |
| 4 PN | |
| 5 PN | |
| Other values (32) |
Length
| Max length | 11 |
|---|---|
| Median length | 4 |
| Mean length | 6.7580991 |
| Min length | 4 |
Characters and Unicode
| Total characters | 66128 |
|---|---|
| Distinct characters | 20 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 10 |
|---|---|
| Unique (%) | 0.1% |
Sample
| 1st row | 2 PN |
|---|---|
| 2nd row | No Bedrooms |
| 3rd row | 5 PN |
| 4th row | No Bedrooms |
| 5th row | No Bedrooms |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| No Bedrooms | 3835 | 39.2% |
| 2 PN | 1720 | 17.6% |
| 3 PN | 1605 | 16.4% |
| 4 PN | 1438 | 14.7% |
| 5 PN | 509 | 5.2% |
| 6 PN | 222 | 2.3% |
| 1 PN | 141 | 1.4% |
| 7 PN | 80 | 0.8% |
| 8 PN | 65 | 0.7% |
| 9 PN | 32 | 0.3% |
| Other values (27) | 138 | 1.4% |
Length
| Value | Count | Frequency (%) |
|---|---|---|
| pn | 5950 | 30.4% |
| no | 3835 | 19.6% |
| bedrooms | 3835 | 19.6% |
| 2 | 1720 | 8.8% |
| 3 | 1605 | 8.2% |
| 4 | 1438 | 7.3% |
| 5 | 509 | 2.6% |
| 6 | 222 | 1.1% |
| 1 | 141 | 0.7% |
| 7 | 80 | 0.4% |
| Other values (29) | 235 | 1.2% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| o | 11505 | 17.4% |
| N | 9785 | 14.8% |
| (space) | 9785 | 14.8% |
| P | 5950 | 9.0% |
| e | 3835 | 5.8% |
| d | 3835 | 5.8% |
| r | 3835 | 5.8% |
| m | 3835 | 5.8% |
| s | 3835 | 5.8% |
| B | 3835 | 5.8% |
| Other values (10) | 6093 | 9.2% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 66128 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 66128 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 66128 | 100.0% |
Bathrooms_df1
Categorical
IMBALANCE 
| Distinct | 36 |
|---|---|
| Distinct (%) | 0.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 615.4 KiB |
| No Bathrooms | |
|---|---|
| 2 WC | |
| 3 WC | |
| 4 WC | |
| 5 WC | |
| Other values (31) |
Length
| Max length | 12 |
|---|---|
| Median length | 4 |
| Mean length | 7.3931528 |
| Min length | 4 |
Characters and Unicode
| Total characters | 72342 |
|---|---|
| Distinct characters | 22 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 11 |
|---|---|
| Unique (%) | 0.1% |
Sample
| 1st row | 1 WC |
|---|---|
| 2nd row | No Bathrooms |
| 3rd row | 6 WC |
| 4th row | No Bathrooms |
| 5th row | No Bathrooms |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| No Bathrooms | 4134 | 42.2% |
| 2 WC | 1862 | 19.0% |
| 3 WC | 1396 | 14.3% |
| 4 WC | 955 | 9.8% |
| 5 WC | 559 | 5.7% |
| 1 WC | 329 | 3.4% |
| 6 WC | 242 | 2.5% |
| 7 WC | 99 | 1.0% |
| 8 WC | 68 | 0.7% |
| 10 WC | 30 | 0.3% |
| Other values (26) | 111 | 1.1% |
Length
| Value | Count | Frequency (%) |
|---|---|---|
| wc | 5651 | 28.9% |
| no | 4134 | 21.1% |
| bathrooms | 4134 | 21.1% |
| 2 | 1862 | 9.5% |
| 3 | 1396 | 7.1% |
| 4 | 955 | 4.9% |
| 5 | 559 | 2.9% |
| 1 | 329 | 1.7% |
| 6 | 242 | 1.2% |
| 7 | 99 | 0.5% |
| Other values (28) | 209 | 1.1% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| o | 12402 | 17.1% |
| (space) | 9785 | 13.5% |
| W | 5651 | 7.8% |
| C | 5651 | 7.8% |
| r | 4134 | 5.7% |
| s | 4134 | 5.7% |
| m | 4134 | 5.7% |
| N | 4134 | 5.7% |
| h | 4134 | 5.7% |
| t | 4134 | 5.7% |
| Other values (12) | 14049 | 19.4% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 72342 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 72342 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 72342 | 100.0% |
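Bedrooms_df1 and Bathrooms_df1 are categorical only because each count carries a suffix (`PN`, short for phòng ngủ, bedroom; `WC`) or is replaced by a `No Bedrooms` / `No Bathrooms` placeholder. A minimal sketch of converting them to integers is shown below; the helper name `parse_room_count` is hypothetical, and mapping the placeholders to `None` rather than to zero is an assumption about what they mean.

```python
import re

# A bare integer, optionally followed by the PN or WC suffix.
_COUNT_RE = re.compile(r"\s*(\d+)\s*(?:PN|WC)?\s*", re.IGNORECASE)


def parse_room_count(text):
    """Return the room count as an int, or None for placeholders."""
    if text is None:
        return None
    match = _COUNT_RE.fullmatch(text)
    # Anything that is not "<digits> [PN|WC]" is treated as not stated.
    return int(match.group(1)) if match else None


print(parse_room_count("2 PN"))         # 2
print(parse_room_count("10 WC"))        # 10
print(parse_room_count("No Bedrooms"))  # None
```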
Address
Text
| Distinct | 2560 |
|---|---|
| Distinct (%) | 26.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.2 MiB |
Length
| Max length | 255 |
|---|---|
| Median length | 10 |
| Mean length | 20.05161 |
| Min length | 1 |
Characters and Unicode
| Total characters | 196205 |
|---|---|
| Distinct characters | 201 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 1935 |
|---|---|
| Unique (%) | 19.8% |
Sample
| 1st row | Thành phố Hồ Chí Minh |
|---|---|
| 2nd row | No Address |
| 3rd row | Đường Xóm Chiếu |
| 4th row | Nhà Chính chủ hẻm taxi Mễ Cốc, P15, Q8. |
| 5th row | No Address |
| Value | Count | Frequency (%) |
|---|---|---|
| no | 4772 | 11.3% |
| address | 4772 | 11.3% |
| minh | 1470 | 3.5% |
| quận | 1457 | 3.4% |
| phường | 1321 | 3.1% |
| bình | 1321 | 3.1% |
| tân | 1296 | 3.1% |
| chí | 1240 | 2.9% |
| hồ | 1235 | 2.9% |
| đường | 952 | 2.3% |
| Other values (1301) | 22450 | 53.1% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| (space) | 32601 | 16.6% |
| n | 14709 | 7.5% |
| h | 14106 | 7.2% |
| s | 9685 | 4.9% |
| d | 9574 | 4.9% |
| N | 6475 | 3.3% |
| T | 5962 | 3.0% |
| o | 5825 | 3.0% |
| r | 5536 | 2.8% |
| , | 5330 | 2.7% |
| Other values (191) | 86402 | 44.0% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 196205 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 196205 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 196205 | 100.0% |
Listing ID
Real number (ℝ)
UNIQUE 
| Distinct | 9785 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 267877.37 |
| Minimum | 250383 |
| Maximum | 285164 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 76.6 KiB |
Quantile statistics
| Minimum | 250383 |
|---|---|
| 5-th percentile | 251918.4 |
| Q1 | 258923 |
| median | 267513 |
| Q3 | 277078 |
| 95-th percentile | 283193.2 |
| Maximum | 285164 |
| Range | 34781 |
| Interquartile range (IQR) | 18155 |
Descriptive statistics
| Standard deviation | 10094.881 |
|---|---|
| Coefficient of variation (CV) | 0.037684709 |
| Kurtosis | -1.2343745 |
| Mean | 267877.37 |
| Median Absolute Deviation (MAD) | 8996 |
| Skewness | -0.011984764 |
| Sum | 2.6211801 × 10⁹ |
| Variance | 1.0190662 × 10⁸ |
| Monotonicity | Not monotonic |
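Several of the statistics above are derived from others, so they can be cross-checked against each other. The minimal sketch below recomputes the range, IQR, coefficient of variation, variance, and sum from the reported figures; the variable names are illustrative. The inputs are rounded as printed, so the results should agree with the table only up to that rounding.

```python
# Figures copied from the Listing ID tables.
minimum, maximum = 250383, 285164
q1, q3 = 258923, 277078
mean, std = 267877.37, 10094.881
n_observations = 9785

value_range = maximum - minimum   # reported: 34781
iqr = q3 - q1                     # reported: 18155
cv = std / mean                   # reported: 0.037684709
variance = std ** 2               # reported: about 1.0190662e8
total = mean * n_observations     # reported: about 2.6211801e9

print(value_range, iqr)
print(f"{cv:.6f} {variance:.4e} {total:.4e}")
```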
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 285164 | 1 | < 0.1% |
| 257378 | 1 | < 0.1% |
| 257402 | 1 | < 0.1% |
| 257394 | 1 | < 0.1% |
| 257386 | 1 | < 0.1% |
| 257384 | 1 | < 0.1% |
| 257381 | 1 | < 0.1% |
| 257380 | 1 | < 0.1% |
| 257379 | 1 | < 0.1% |
| 257377 | 1 | < 0.1% |
| Other values (9775) | 9775 | 99.9% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 250383 | 1 | < 0.1% |
| 250385 | 1 | < 0.1% |
| 250394 | 1 | < 0.1% |
| 250397 | 1 | < 0.1% |
| 250400 | 1 | < 0.1% |
| 250406 | 1 | < 0.1% |
| 250407 | 1 | < 0.1% |
| 250408 | 1 | < 0.1% |
| 250411 | 1 | < 0.1% |
| 250413 | 1 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 285164 | 1 | < 0.1% |
| 285163 | 1 | < 0.1% |
| 285162 | 1 | < 0.1% |
| 285161 | 1 | < 0.1% |
| 285159 | 1 | < 0.1% |
| 285145 | 1 | < 0.1% |
| 285142 | 1 | < 0.1% |
| 285135 | 1 | < 0.1% |
| 285129 | 1 | < 0.1% |
| 285126 | 1 | < 0.1% |
Date
Date
| Distinct | 9196 |
|---|---|
| Distinct (%) | 94.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 76.6 KiB |
| Minimum | 2022-09-06 07:16:00 |
|---|---|
| Maximum | 2023-12-10 20:49:00 |
Price_df2
Text
MISSING 
| Distinct | 2517 |
|---|---|
| Distinct (%) | 26.1% |
| Missing | 144 |
| Missing (%) | 1.5% |
| Memory size | 1.0 MiB |
Length
| Max length | 179 |
|---|---|
| Median length | 97 |
| Mean length | 10.044601 |
| Min length | 1 |
Characters and Unicode
| Total characters | 96840 |
|---|---|
| Distinct characters | 141 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 1521 |
|---|---|
| Unique (%) | 15.8% |
Sample
| 1st row | 12,8 tỷ VND |
|---|---|
| 2nd row | 8,99 tỷ |
| 3rd row | 3.4 tỷ VND |
| 4th row | 7.39 tỷ VND |
| 5th row | 16 tỷ VND |
| Value | Count | Frequency (%) |
|---|---|---|
| tỷ | 8588 | 30.5% |
| vnd | 4935 | 17.5% |
| 4 | 532 | 1.9% |
| 3 | 496 | 1.8% |
| triệu | 471 | 1.7% |
| not | 459 | 1.6% |
| 5 | 457 | 1.6% |
| the | 366 | 1.3% |
| in | 354 | 1.3% |
| 6 | 343 | 1.2% |
| Other values (1398) | 11136 | 39.6% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| (space) | 18496 | 19.1% |
| t | 10957 | 11.3% |
| ỷ | 8623 | 8.9% |
| N | 5417 | 5.6% |
| D | 4962 | 5.1% |
| V | 4951 | 5.1% |
| . | 4027 | 4.2% |
| 5 | 3805 | 3.9% |
| 1 | 2241 | 2.3% |
| i | 2205 | 2.3% |
| Other values (131) | 31156 | 32.2% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 96840 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 96840 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 96840 | 100.0% |
Area_df2
Text
MISSING 
| Distinct | 4356 |
|---|---|
| Distinct (%) | 45.2% |
| Missing | 151 |
| Missing (%) | 1.5% |
| Memory size | 733.7 KiB |
Length
| Max length | 147 |
|---|---|
| Median length | 109 |
| Mean length | 11.403674 |
| Min length | 2 |
Characters and Unicode
| Total characters | 109863 |
|---|---|
| Distinct characters | 141 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 3511 |
|---|---|
| Unique (%) | 36.4% |
Sample
| 1st row | 75m2 (3,2m x 21m) |
|---|---|
| 2nd row | 85m ngang 8m |
| 3rd row | 105m |
| 4th row | 53m2 |
| 5th row | 4x20m |
| Value | Count | Frequency (%) |
|---|---|---|
| x | 2235 | 9.0% |
| m2 | 1043 | 4.2% |
| not | 752 | 3.0% |
| the | 553 | 2.2% |
| in | 527 | 2.1% |
| specified | 438 | 1.8% |
| text | 426 | 1.7% |
| 4m | 397 | 1.6% |
| ngang | 368 | 1.5% |
| 4 | 344 | 1.4% |
| Other values (2982) | 17873 | 71.6% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| (space) | 15328 | 14.0% |
| m | 12393 | 11.3% |
| 2 | 10160 | 9.2% |
| 1 | 5821 | 5.3% |
| 4 | 4984 | 4.5% |
| x | 4962 | 4.5% |
| 5 | 4939 | 4.5% |
| 0 | 3476 | 3.2% |
| t | 3384 | 3.1% |
| n | 3195 | 2.9% |
| Other values (131) | 41221 | 37.5% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 109863 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 109863 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 109863 | 100.0% |
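The Area_df2 samples mix several notations: a stated area (`53m2`, `105m`), an area followed by dimensions (`75m2 (3,2m x 21m)`), an area followed by a width (`85m ngang 8m`, where ngang means width), and bare dimensions (`4x20m`). A minimal sketch of one way to reduce these to a single number in square metres follows. The function name `parse_area_m2` is hypothetical, and two rules are assumptions rather than anything the report defines: a leading `<number>m` is read as the total area, and bare `a x b` dimensions are multiplied.

```python
import re

_NUM = r"(\d+(?:[.,]\d+)?)"
# Bare dimensions at the start of the string, e.g. "4x20m" or "4m x 20m".
_DIMS_RE = re.compile(rf"^\s*{_NUM}\s*m?\s*[x×]\s*{_NUM}", re.IGNORECASE)
# A leading area such as "53m2" or "105m".
_AREA_RE = re.compile(rf"^\s*{_NUM}\s*m2?\b", re.IGNORECASE)


def _to_float(number_text):
    # Accept both "3.2" and the comma decimal "3,2".
    return float(number_text.replace(",", "."))


def parse_area_m2(text):
    """Return the area in square metres, or None if it cannot be read."""
    if text is None:
        return None
    dims = _DIMS_RE.match(text)
    if dims:
        # Assumption: bare "a x b" means width times length.
        return _to_float(dims.group(1)) * _to_float(dims.group(2))
    area = _AREA_RE.match(text)
    if area:
        # Assumption: a leading "<number>m" is the total area.
        return _to_float(area.group(1))
    return None


print(parse_area_m2("75m2 (3,2m x 21m)"))  # 75.0
print(parse_area_m2("4x20m"))              # 80.0
print(parse_area_m2("85m ngang 8m"))       # 85.0
```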
Bedrooms_df2
Text
MISSING 
| Distinct | 259 |
|---|---|
| Distinct (%) | 2.7% |
| Missing | 149 |
| Missing (%) | 1.5% |
| Memory size | 607.5 KiB |
Length
| Max length | 149 |
|---|---|
| Median length | 1 |
| Mean length | 5.205687 |
| Min length | 1 |
Characters and Unicode
| Total characters | 50162 |
|---|---|
| Distinct characters | 104 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 181 |
|---|---|
| Unique (%) | 1.9% |
Sample
| 1st row | 5 |
|---|---|
| 2nd row | 7 |
| 3rd row | 4 |
| 4th row | 4 |
| 5th row | Not mentioned |
| Value | Count | Frequency (%) |
|---|---|---|
| not | 2578 | 18.2% |
| 2 | 1998 | 14.1% |
| 3 | 1895 | 13.4% |
| 4 | 1703 | 12.0% |
| specified | 1359 | 9.6% |
| mentioned | 1115 | 7.9% |
| 5 | 586 | 4.1% |
| 6 | 278 | 2.0% |
| the | 219 | 1.5% |
| in | 204 | 1.4% |
| Other values (250) | 2211 | 15.6% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| e | 5571 | 11.1% |
| t | 4559 | 9.1% |
| (space) | 4510 | 9.0% |
| i | 4419 | 8.8% |
| o | 4006 | 8.0% |
| n | 3219 | 6.4% |
| d | 2746 | 5.5% |
| N | 2628 | 5.2% |
| 2 | 2073 | 4.1% |
| 3 | 1933 | 3.9% |
| Other values (94) | 14498 | 28.9% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 50162 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 50162 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 50162 | 100.0% |
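The word counts for Bedrooms_df2 (`not`, `specified`, `mentioned`, `in`, `the`) suggest that many values are placeholder phrases such as `Not mentioned` or `Not specified`, which the profiler counts as present text rather than as missing. A minimal sketch of mapping such placeholders to `None` before profiling is shown below. The function name `to_missing` is hypothetical, and the pattern is an assumption based on the samples and word counts in this report (including the `No Bedrooms` style used in the `_df1` columns); the real data may contain other phrasings.

```python
import re

# "No <word>" (e.g. "No Bedrooms") or "Not mentioned/specified ...".
_PLACEHOLDER_RE = re.compile(
    r"\s*(?:no\s+\w+|not\s+(?:mentioned|specified)\b.*)\s*",
    re.IGNORECASE,
)


def to_missing(value):
    """Return None for placeholder strings, otherwise the value unchanged."""
    if value is None:
        return None
    return None if _PLACEHOLDER_RE.fullmatch(value) else value


print(to_missing("Not mentioned"))              # None
print(to_missing("Not specified in the text"))  # None
print(to_missing("5"))                          # 5
```

Applied before profiling, a step like this would likely raise the reported missing counts for the `_df2` columns well above the 1.5% to 1.8% shown in the alerts.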
Bathrooms_df2
Text
MISSING 
| Distinct | 198 |
|---|---|
| Distinct (%) | 2.1% |
| Missing | 151 |
| Missing (%) | 1.5% |
| Memory size | 622.7 KiB |
Length
| Max length | 94 |
|---|---|
| Median length | 1 |
| Mean length | 7.3130579 |
| Min length | 1 |
Characters and Unicode
| Total characters | 70454 |
|---|---|
| Distinct characters | 96 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 129 |
|---|---|
| Unique (%) | 1.3% |
Sample
| 1st row | 6 |
|---|---|
| 2nd row | 3 |
| 3rd row | 3 |
| 4th row | 5 |
| 5th row | Not mentioned |
| Value | Count | Frequency (%) |
|---|---|---|
| not | 4147 | 25.8% |
| specified | 2238 | 13.9% |
| 2 | 1915 | 11.9% |
| mentioned | 1775 | 11.0% |
| 3 | 1356 | 8.4% |
| 4 | 793 | 4.9% |
| 5 | 528 | 3.3% |
| the | 406 | 2.5% |
| in | 387 | 2.4% |
| 1 | 354 | 2.2% |
| Other values (194) | 2168 | 13.5% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| e | 9007 | 12.8% |
| t | 7392 | 10.5% |
| i | 7207 | 10.2% |
| (space) | 6433 | 9.1% |
| o | 6251 | 8.9% |
| n | 4597 | 6.5% |
| d | 4321 | 6.1% |
| N | 4124 | 5.9% |
| c | 2549 | 3.6% |
| p | 2548 | 3.6% |
| Other values (86) | 16025 | 22.7% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 70454 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 70454 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 70454 | 100.0% |
Floors
Text
MISSING 
| Distinct | 615 |
|---|---|
| Distinct (%) | 6.4% |
| Missing | 155 |
| Missing (%) | 1.6% |
| Memory size | 746.8 KiB |
Length
| Max length | 127 |
|---|---|
| Median length | 1 |
| Mean length | 6.3073728 |
| Min length | 1 |
Characters and Unicode
| Total characters | 60740 |
|---|---|
| Distinct characters | 112 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 425 |
|---|---|
| Unique (%) | 4.4% |
Sample
| 1st row | 4 |
|---|---|
| 2nd row | 1 trệt, 1 lầu |
| 3rd row | 2 |
| 4th row | 4 |
| 5th row | Not mentioned |
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 3403 | 17.2% |
| 2 | 2958 | 15.0% |
| 3 | 2071 | 10.5% |
| lầu | 1941 | 9.8% |
| trệt | 1759 | 8.9% |
| 4 | 1319 | 6.7% |
| not | 1180 | 6.0% |
| 5 | 722 | 3.7% |
| mentioned | 629 | 3.2% |
| specified | 482 | 2.4% |
| Other values (219) | 3269 | 16.6% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| (space) | 10104 | 16.6% |
| t | 6276 | 10.3% |
| 1 | 3470 | 5.7% |
| 2 | 2970 | 4.9% |
| n | 2904 | 4.8% |
| e | 2619 | 4.3% |
| l | 2581 | 4.2% |
| o | 2575 | 4.2% |
| ầ | 2422 | 4.0% |
| r | 2320 | 3.8% |
| Other values (102) | 22499 | 37.0% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 60740 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 60740 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 60740 | 100.0% |
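Floors mixes bare numbers (`4`) with descriptions such as `1 trệt, 1 lầu` (one ground floor, one upper floor). A minimal sketch of reducing both forms to a single storey count follows. The function name `parse_floor_count` is hypothetical, and counting the ground floor (trệt) as one storey alongside each upper floor (lầu) is an assumption; the report does not say how the bare numbers were counted, nor how other terms should be treated.

```python
import re
import unicodedata

# "<number> trệt" or "<number> lầu" parts inside a description.
_PART_RE = re.compile(r"(\d+)\s*(trệt|lầu)", re.IGNORECASE)


def parse_floor_count(text):
    """Return a total storey count, or None if none can be read."""
    if text is None:
        return None
    # Normalise so composed and decomposed Vietnamese letters compare equal.
    text = unicodedata.normalize("NFC", text).strip()
    if re.fullmatch(r"\d+", text):
        return int(text)
    parts = _PART_RE.findall(text)
    if not parts:
        return None  # e.g. "Not mentioned"
    # Assumption: ground floor and upper floors each count as storeys.
    return sum(int(number) for number, _ in parts)


print(parse_floor_count("1 trệt, 1 lầu"))  # 2
print(parse_floor_count("4"))              # 4
print(parse_floor_count("Not mentioned"))  # None
```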
Amenities
Text
MISSING 
| Distinct | 8053 |
|---|---|
| Distinct (%) | 83.8% |
| Missing | 174 |
| Missing (%) | 1.8% |
| Memory size | 2.7 MiB |
Length
| Max length | 478 |
|---|---|
| Median length | 324 |
| Mean length | 67.111019 |
| Min length | 3 |
Characters and Unicode
| Total characters | 645004 |
|---|---|
| Distinct characters | 212 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 7545 |
|---|---|
| Unique (%) | 78.5% |
Sample
| 1st row | Sân Thượng, Bếp, Mặt bằng buôn bán |
|---|---|
| 2nd row | khu vực xây dựng nhà cao tầng, ô tô ngủ trong nhà |
| 3rd row | Sân thượng, Nhà xây dựng kiên cố, Hẻm taxi tận cửa |
| 4th row | Khu vực an ninh, dân trí cao, gần mặt tiền |
| 5th row | An ninh cao, Gần công viên đi bộ, Đường 18m có vỉa hè, Hiếm nhà có đầy đủ công năng |
| Value | Count | Frequency (%) |
|---|---|---|
| công | 2500 | 1.9% |
| sân | 2247 | 1.7% |
| gần | 2104 | 1.6% |
| phòng | 2089 | 1.5% |
| trường | 2055 | 1.5% |
| chợ | 1933 | 1.4% |
| nhà | 1872 | 1.4% |
| tiện | 1802 | 1.3% |
| học | 1716 | 1.3% |
| xe | 1608 | 1.2% |
| Other values (3725) | 115012 | 85.2% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| (space) | 125335 | 19.4% |
| n | 60672 | 9.4% |
| h | 44189 | 6.9% |
| , | 28867 | 4.5% |
| g | 28334 | 4.4% |
| t | 27948 | 4.3% |
| c | 25243 | 3.9% |
| i | 22901 | 3.6% |
| a | 16588 | 2.6% |
| u | 13618 | 2.1% |
| Other values (202) | 251309 | 39.0% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 645004 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 645004 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 645004 | 100.0% |
Street name
Text
MISSING 
| Distinct | 2028 |
|---|---|
| Distinct (%) | 21.1% |
| Missing | 152 |
| Missing (%) | 1.6% |
| Memory size | 1.1 MiB |
Length
| Max length | 137 |
|---|---|
| Median length | 94 |
| Mean length | 13.505346 |
| Min length | 2 |
Characters and Unicode
| Total characters | 130097 |
|---|---|
| Distinct characters | 192 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 1255 |
|---|---|
| Unique (%) | 13.0% |
Sample
| 1st row | Nguyễn Tất Thành |
|---|---|
| 2nd row | Hòa Bình |
| 3rd row | Mễ Cốc |
| 4th row | Phan Huy Ích |
| 5th row | Bùi Tá Hán |
| Value | Count | Frequency (%) |
|---|---|---|
| văn | 1699 | 5.9% |
| not | 1494 | 5.2% |
| nguyễn | 1079 | 3.7% |
| đường | 986 | 3.4% |
| lê | 881 | 3.1% |
| mentioned | 757 | 2.6% |
| số | 620 | 2.2% |
| specified | 592 | 2.1% |
| phan | 422 | 1.5% |
| quang | 378 | 1.3% |
| Other values (1069) | 19918 | 69.1% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| (space) | 19193 | 14.8% |
| n | 14391 | 11.1% |
| h | 6453 | 5.0% |
| g | 6252 | 4.8% |
| i | 5506 | 4.2% |
| t | 4106 | 3.2% |
| T | 3985 | 3.1% |
| u | 3985 | 3.1% |
| o | 3744 | 2.9% |
| e | 3693 | 2.8% |
| Other values (182) | 58789 | 45.2% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 130097 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 130097 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 130097 | 100.0% |
Ward name
Text
MISSING 
| Distinct | 733 |
|---|---|
| Distinct (%) | 7.6% |
| Missing | 170 |
| Missing (%) | 1.7% |
| Memory size | 897.8 KiB |
Length
| Max length | 75 |
|---|---|
| Median length | 68 |
| Mean length | 11.949766 |
| Min length | 1 |
Characters and Unicode
| Total characters | 114897 |
|---|---|
| Distinct characters | 148 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 381 |
|---|---|
| Unique (%) | 4.0% |
Sample
| 1st row | Not mentioned |
|---|---|
| 2nd row | Tân Phú |
| 3rd row | P15 |
| 4th row | Not specified |
| 5th row | An Phú |
| Value | Count | Frequency (%) |
| not | 3893 | |
| phường | 2479 | 10.8% |
| mentioned | 1949 | 8.5% |
| specified | 1678 | 7.3% |
| bình | 779 | 3.4% |
| tân | 706 | 3.1% |
| in | 601 | 2.6% |
| the | 600 | 2.6% |
| phú | 498 | 2.2% |
| text | 463 | 2.0% |
| Other values (407) | 9262 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 13293 | 11.6% |
| n | 11677 | 10.2% |
| e | 8731 | 7.6% |
| t | 7876 | 6.9% |
| i | 7576 | 6.6% |
| h | 6536 | 5.7% |
| o | 6465 | 5.6% |
| d | 4186 | 3.6% |
| N | 4100 | 3.6% |
| g | 4051 | 3.5% |
| Other values (138) | 40406 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 114897 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 114897 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 114897 |
District name
Text
MISSING 
| Distinct | 539 |
|---|---|
| Distinct (%) | 5.6% |
| Missing | 149 |
| Missing (%) | 1.5% |
| Memory size | 972.5 KiB |
Length
| Max length | 94 |
|---|---|
| Median length | 82 |
| Mean length | 9.7567455 |
| Min length | 1 |
Characters and Unicode
| Total characters | 94016 |
|---|---|
| Distinct characters | 147 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 364 |
|---|---|
| Unique (%) | 3.8% |
Sample
| 1st row | Quận 4 |
|---|---|
| 2nd row | Tân Phú |
| 3rd row | Q8 |
| 4th row | Tân Bình |
| 5th row | Quận 2 |
| Value | Count | Frequency (%) |
| quận | 3098 | |
| bình | 2529 | 11.4% |
| tân | 2022 | 9.1% |
| not | 1507 | 6.8% |
| gò | 1095 | 4.9% |
| vấp | 1091 | 4.9% |
| phú | 1052 | 4.7% |
| thạnh | 992 | 4.5% |
| đức | 731 | 3.3% |
| thủ | 724 | 3.3% |
| Other values (364) | 7401 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 12606 | 13.4% |
| n | 11858 | 12.6% |
| h | 7703 | 8.2% |
| T | 3982 | 4.2% |
| u | 3800 | 4.0% |
| ậ | 3618 | 3.8% |
| e | 3552 | 3.8% |
| t | 3480 | 3.7% |
| Q | 3183 | 3.4% |
| i | 3043 | 3.2% |
| Other values (137) | 37191 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 94016 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 94016 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 94016 |
Frontages
Text
MISSING 
| Distinct | 700 |
|---|---|
| Distinct (%) | 7.3% |
| Missing | 175 |
| Missing (%) | 1.8% |
| Memory size | 662.9 KiB |
Length
| Max length | 101 |
|---|---|
| Median length | 82 |
| Mean length | 10.346722 |
| Min length | 1 |
Characters and Unicode
| Total characters | 99432 |
|---|---|
| Distinct characters | 131 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 499 |
|---|---|
| Unique (%) | 5.2% |
Sample
| 1st row | Not mentioned |
|---|---|
| 2nd row | 2 |
| 3rd row | 37m2 |
| 4th row | Not specified |
| 5th row | 4m |
| Value | Count | Frequency (%) |
| not | 5533 | |
| mentioned | 2808 | |
| specified | 2369 | |
| the | 781 | 4.1% |
| in | 759 | 3.9% |
| text | 555 | 2.9% |
| 2 | 551 | 2.9% |
| 1 | 542 | 2.8% |
| provided | 334 | 1.7% |
| 4m | 328 | 1.7% |
| Other values (593) | 4691 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 12296 | |
| t | 11175 | |
| (space) | 9641 | |
| i | 9591 | |
| o | 8825 | |
| n | 7630 | 7.7% |
| d | 5913 | 5.9% |
| N | 5442 | 5.5% |
| m | 5183 | 5.2% |
| p | 2908 | 2.9% |
| Other values (121) | 20828 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 99432 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 99432 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 99432 |
Main road
Text
MISSING 
| Distinct | 1166 |
|---|---|
| Distinct (%) | 12.1% |
| Missing | 151 |
| Missing (%) | 1.5% |
| Memory size | 1.0 MiB |
Length
| Max length | 133 |
|---|---|
| Median length | 127 |
| Mean length | 13.401702 |
| Min length | 3 |
Characters and Unicode
| Total characters | 129112 |
|---|---|
| Distinct characters | 154 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 980 |
|---|---|
| Unique (%) | 10.2% |
Sample
| 1st row | Mặt tiền ngay trục đường kinh doanh nhộn nhịp |
|---|---|
| 2nd row | ngay đường |
| 3rd row | Nhà trong hẻm |
| 4th row | Hẻm |
| 5th row | Ngoài lộ |
| Value | Count | Frequency (%) |
| hẻm | 5463 | |
| nhà | 3654 | |
| trong | 3244 | 11.3% |
| not | 1825 | 6.3% |
| lộ | 1482 | 5.1% |
| ngoài | 1455 | 5.1% |
| mentioned | 945 | 3.3% |
| specified | 792 | 2.8% |
| xe | 689 | 2.4% |
| hơi | 526 | 1.8% |
| Other values (720) | 8704 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 19146 | |
| n | 9538 | 7.4% |
| h | 9413 | 7.3% |
| t | 9153 | 7.1% |
| o | 8187 | 6.3% |
| m | 7491 | 5.8% |
| i | 6458 | 5.0% |
| N | 6141 | 4.8% |
| g | 6066 | 4.7% |
| e | 5567 | 4.3% |
| Other values (144) | 41952 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 129112 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 129112 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 129112 |
| | Price_df1 | Area_df1 | Bedrooms_df1 | Bathrooms_df1 | Address | Listing ID | Date | Price_df2 | Area_df2 | Bedrooms_df2 | Bathrooms_df2 | Floors | Amenities | Street name | Ward name | District name | Frontages | Main road |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 3899000000 tỷ | 150 m | 2 PN | 1 WC | Thành phố Hồ Chí Minh | 285164 | 12/10/2023 17:11 | 12,8 tỷ VND | 75m2 (3,2m x 21m) | 5 | 6 | 4 | Sân Thượng, Bếp, Mặt bằng buôn bán | Nguyễn Tất Thành | Not mentioned | Quận 4 | Not mentioned | Mặt tiền ngay trục đường kinh doanh nhộn nhịp |
| 1 | Thỏa thuận | No Area | No Bedrooms | No Bathrooms | No Address | 285126 | 12/8/2023 11:53 | 8,99 tỷ | 85m ngang 8m | 7 | 3 | 1 trệt, 1 lầu | khu vực xây dựng nhà cao tầng, ô tô ngủ trong nhà | Hòa Bình | Tân Phú | Tân Phú | 2 | ngay đường |
| 2 | 12.8 tỷ | 75 m | 5 PN | 6 WC | Đường Xóm Chiếu | 285118 | 12/8/2023 14:17 | 3.4 tỷ VND | 105m | 4 | 3 | 2 | Sân thượng, Nhà xây dựng kiên cố, Hẻm taxi tận cửa | Mễ Cốc | P15 | Q8 | 37m2 | Nhà trong hẻm |
| 3 | 3.4 tỷ | 110 m | No Bedrooms | No Bathrooms | Nhà Chính chủ hẻm taxi Mễ Cốc, P15, Q8. | 285109 | 12/8/2023 14:56 | 7.39 tỷ VND | 53m2 | 4 | 5 | 4 | Khu vực an ninh, dân trí cao, gần mặt tiền | Phan Huy Ích | Not specified | Tân Bình | Not specified | Hẻm |
| 4 | Thỏa thuận | 12 m | No Bedrooms | No Bathrooms | No Address | 285107 | 12/8/2023 15:36 | 16 tỷ VND | 4x20m | Not mentioned | Not mentioned | Not mentioned | An ninh cao, Gần công viên đi bộ, Đường 18m có vỉa hè, Hiếm nhà có đầy đủ công năng | Bùi Tá Hán | An Phú | Quận 2 | 4m | Ngoài lộ |
| 5 | 8.99 tỷ | 85 m | 7 PN | 3 WC | No Address | 285102 | 12/8/2023 16:52 | 1.x tỷ VND | 12m2, diện tích ngang: 3m; dài 4m, diện tích sử dụng: 24m2 | 1 | Not specified | 2 (1 trệt, 1 lầu) | Khu dân cư hiện hữu, không quy hoạch | Trần Xuân Soạn | Tân Hưng | Quận 7 | Not specified | Hẻm xe máy |
| 6 | 7.39 tỷ | 53 m | 4 PN | 4 WC | No Address | 285100 | 12/8/2023 17:08 | 23,5 tỷ | 118m2 | Not specified | Not specified | Not specified | Ví trí đẹp đối diện Trung tâm thương mại GigaMall, Vị trí khu vực kinh doanh sầm uất, tấp nập, Thích hợp xây cao ốc, văn phòng, khách sạn, showroom hoặc kinh doanh buôn bán, Hiện đang cho thuê 30tr/tháng, Hỗ trợ vay Ngân Hàng | Phạm Văn Đồng | Hiệp Bình Chánh | Thủ Đức | 2 | Not specified |
| 7 | 16 tỷ | 80 m | 6 PN | 6 WC | Quận 2, Hồ Chí Minh | 285098 | 12/8/2023 17:56 | 4 tỷ | 50m2 | 3 | Not mentioned | Not mentioned | Giao thông thuận tiện, dịch vụ tiện ích đầy đủ, gần trường học các cấp, gần chợ, đại học | Nguyễn Khuyến | Phường 12 | Bình Thạnh | Not mentioned | Hẻm |
| 8 | 4.75 tỷ | 50 m | No Bedrooms | No Bathrooms | Nguyễn Khuyến, phường 12, quận Bình Thạnh | 285097 | 12/8/2023 18:00 | 1.1 tỷ VND | 150m2 (5 x 30m) | 2 | 1 | Not mentioned | Sân sau trồng cây | Not mentioned | Not mentioned | Huyện Nhà Bè | Not mentioned | Hẻm nhưng xe hơi tránh được |
| 9 | 5.5 tỷ | No Area | No Bedrooms | No Bathrooms | No Address | 285087 | 12/9/2023 9:31 | 5 tỷ VND | 96m2 (4.3m x 22.5m) | Not specified | Not specified | 1 trệt, 1 lửng | Not specified | Not specified | Bình Tân | Tân Phú | Not specified | Hẻm |
| | Price_df1 | Area_df1 | Bedrooms_df1 | Bathrooms_df1 | Address | Listing ID | Date | Price_df2 | Area_df2 | Bedrooms_df2 | Bathrooms_df2 | Floors | Amenities | Street name | Ward name | District name | Frontages | Main road |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 9775 | 5.5 tỷ | 80 m | 3 PN | No Bathrooms | Thành phố Hồ Chí Minh | 278681 | 9/21/2023 11:18 | Giá không được cung cấp trong nội dung | 40m2, 3.9 x 10.5 | 3 | 4 | 1 trệt, 2 lầu | Phòng khách, bếp, gần chợ Phú Lâm, siêu thị, trường học, bến xe, bệnh viện | Không được cung cấp | Tân Hòa Đông | Quận 6 | Không được cung cấp | Ngoài lộ |
| 9776 | 4.1 tỷ | 33.3 m | No Bedrooms | 2 WC | Cách Mạng tháng 8, Phường 15, Quận 10 | 278673 | 9/21/2023 13:42 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 9777 | 4 tỷ | 110 m | 5 PN | 5 WC | Đình Phong Phú, Tăng Nhơn Phú B, Q.9 | 278719 | 9/21/2023 21:27 | 4 tỷ | 110m2 | 5 | 5 | Not mentioned | 2 ph ra trường mầm non, UBND, siêu thị winmart, Giao thông thuận tiện kết nối Trung tâm Thủ Đức, Cao tốc Long Thành , Q.2, Q.1 | Not mentioned | Not mentioned | Not mentioned | Not mentioned | Not mentioned |
| 9778 | 4.1 tỷ | 33.3 m | No Bedrooms | 2 WC | Cách Mạng tháng 8, Phường 15, Quận 10 | 278672 | 9/21/2023 13:42 | 4 tỷ 1 | 33m2, 5.6x6 | 2 | 2 | 2 | Nhà mới đẹp, trước chủ cho thuê 10tr/tháng, Sổ hồng có sẵn, Hỗ trợ pháp lý & Ngân hàng | Cách Mạng Tháng 8 | Phường 15 | Quận 10 | 2 | Nhà trong chợ Hòa Hưng, cách mặt tiền đường Cách Mạng Tháng 8 10m |
| 9779 | Thỏa thuận | No Area | No Bedrooms | No Bathrooms | No Address | 278718 | 9/21/2023 21:42 | 5 tỷ 5 | 80m2 (4.4x18) | Not specified | Not specified | 3 tầng | Sân để xe rộng, ban công lớn, giếng trời, gần UBND quận Tân Phú, Siêu Thị, Trường Học, Bênhk Viện, Ngân Hàng | Nguyễn Sơn | Not specified | Tân Phú | 6m | Not specified |
| 9780 | 5.2 tỷ | 40 m | 3 PN | 4 WC | No Address | 278717 | 9/21/2023 22:11 | 14 tỷ VND (thương lượng) / 36 triệu VND (thuê) | 250m2 | 5 | 3 | 2 | Thiết kế phù hợp với tiện ở và kinh doanh | Lê Văn Lương | Phường Tân Phong | Quận 7 | Not specified | Ngoài lộ |
| 9781 | 7.8 tỷ | 89 m | 3 PN | 3 WC | No Address | 278716 | 9/21/2023 22:16 | 6.68 tỷ VND | 380m2 | 4 | Not specified | 4 | Nội thất cao cấp | Trương Đức Toàn | Tân Phú | Quận 11 | 4x80m | Hẻm ô tô |
| 9782 | 6.68 tỷ | 80 m | 4 PN | 4 WC | Phường Tân Tạo A, Quận Bình Tân, TP. HCM | 278714 | 9/21/2023 22:46 | Giá không được cung cấp trong nội dung | 89m2, 6.3 x 14.5 | 3 | 3 | 2 tầng | Phòng khách, bếp, sân đậu ô tô | Đường Kênh Tân Hóa | Không được cung cấp trong nội dung | Quận 6 | Không được cung cấp trong nội dung | Ngoài lộ |
| 9783 | 4600000000 tỷ | 50 m | 4 PN | 3 WC | Thành phố Hồ Chí Minh | 278713 | 9/22/2023 6:25 | Not specified in the listing | Not specified in the listing | 4 | Not specified in the listing | 2 | Karaoke room, Steam room, Roof garden, Solar power system, Friendly neighborhood, Nearby schools and markets, Gym | Hoàng Bật Đạt | Phường 15 | Quận Tân Bình | Not specified in the listing | Not specified in the listing |
| 9784 | 7.3 tỷ | 112 m | 4 PN | 4 WC | Hoàng Bật Đạt Tỷ Phường 15 Quận Tân Bình | 278707 | 9/21/2023 8:37 | 4,6 tỷ VND | 50m2 | 3 | 3 | 4 | BTCT, nội thất, sổ hồng riêng, hoàn công đầy đủ, pháp lý chuẩn | Not mentioned in the listing | Not mentioned in the listing | Tân Bình | 2 | Nhà trong hẻm |
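
The sample rows show that many cells counted as present actually hold placeholder text ("Not mentioned", "Not specified in the listing", "Không được cung cấp", "No Address"), so the reported missing-value percentages probably understate how much information is absent. The following is a minimal sketch of how such placeholders could be mapped to a single missing marker before re-profiling. The function name `normalize_placeholder` and the phrase list are assumptions drawn only from the rows shown here, and the list is likely incomplete for the full dataset.

```python
import re
import unicodedata

# Placeholder phrases observed in the sample rows above (assumed, not exhaustive).
PLACEHOLDER_PATTERN = re.compile(
    r"^(not (mentioned|specified|provided)\b.*"
    r"|(giá\s+)?không được cung cấp\b.*"
    r"|no (area|bedrooms|bathrooms|address)"
    r"|nan)$",
    re.IGNORECASE,
)


def normalize_placeholder(value):
    """Return None for placeholder or empty text, otherwise the stripped text."""
    if value is None:
        return None
    # NFC normalization so composed and decomposed Vietnamese diacritics compare equal.
    text = unicodedata.normalize("NFC", str(value)).strip()
    if not text or PLACEHOLDER_PATTERN.match(text):
        return None
    return text


if __name__ == "__main__":
    for raw in ["Not mentioned", "Không được cung cấp trong nội dung", "Quận 4"]:
        print(repr(raw), "->", repr(normalize_placeholder(raw)))
```

Values such as "Thỏa thuận" (price negotiable) are deliberately left untouched, since they carry information rather than marking an absent field.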